(57) Abstract: A method and apparatus for virtual channel flow control at the link level, in which the virtual channel allocation is
based on DestinationID. At each hop, cells destined for a particular destination are only allowed to occupy a part of the total available
receiver buffer space. This flow control enables receiver cell buffer sharing, while maintaining per-channel (per-connection)
bandwidth and lossless cell transmission. A higher and more efficient utilization of the receiver is achieved. In addition, the virtual channel
flow control method and apparatus described improve latency characteristics by making the virtual channel flow control more
predictable, and thus provide a method for congestion control. Finally, the present invention implicitly addresses injection rate control and
failed network components (e.g. Host Adapters/IO-subsystems/Bridges/Switches/Routers/etc.). Both of these problems cause
network buffers to be filled up and may lead to watchdog time-out at the transmitter. Watchdog time-out leads to retransmission,
which causes performance degradation of the network.

Virtual Channel Flow Control

FIELD OF THE INVENTION 

The application relates to a method and an apparatus for virtual channel 
flow control at the link level in a communication network. The application also 
relates to uses of the method and apparatus. 

BACKGROUND OF THE INVENTION 

Traditional data-communication networks are usually designed so as to 
operate with reasonable efficiency when the traffic load presented by its sources 
does not exceed a certain limit. If the network load exceeds this limit, a pheno- 
menon often referred to as throughput collapse occurs: The producers deliver an 
increasing amount of traffic to the network, while the network actually delivers a 
decreasing amount of traffic to the consumers. The result is lower performance, 
unpredictable forward progress, and decreasing consumer input capacity. These 
effects are highly undesirable in a System Area Network (SAN). A SAN is an inter- 
connect used for inter-processor (or inter-computer) communication (IPC), and a 
computer-to-IO interconnect. 

Congestion is often used as a synonym for throughput collapse, but it will 
here be referred to as the state in which the traffic load presented to the network 
by its sources approaches or exceeds the maximum network throughput capacity. 
Congestion tolerance is important to all high-speed distributed computer systems. 
Such networks have to cope with large mismatches in throughput (e.g. high- 
throughput producer vs. low-throughput consumer), bursty traffic which often cre- 
ates hot-spots, and load unpredictability ('all-to-all-at-any time' traffic patterns). 

There are basically two main reasons for throughput collapse: packet
dropping/retransmission and head-of-line (HOL) blocking. Packet dropping/retransmission
occurs when the network buffers are filled faster than they are emptied. If
there is no flow control to stop the packet transmission, packets arriving at full
buffers have to be dropped. In a congested system packet dropping/retransmission
easily becomes a regenerative phenomenon.

Flow control prevents packets from being dropped. However, retransmis- 
sion still occurs if the latency introduced by the network is higher than the packet 
watchdog time-out in the hosts and/or IO subsystem. The second cause of
throughput collapse is HOL blocking, which is easy to explain with input queuing (i.e.
packets are buffered in a FIFO at the input port of a switch). If the first packet in
the FIFO cannot be sent due to congestion, this packet will block the other packets
in the FIFO (hence head-of-line blocking). The result is livelocks and retransmission.

Flow control 

Contemporary high-performance cell-based point-to-point interconnects use 
some sort of link-level buffer flow control to provide lossless cell transmission. This 
is often referred to as hop-by-hop flow control, or back-pressure flow control. 
There exist three well-known implementations of hop-by-hop flow control. A brief 
discussion of them all follows: 

• X-on/X-off flow control 

• The transmitter keeps sending packets until it receives an x-off flow control
token from the receiver. At that point the transmitter halts all transmission.
Transmission is re-enabled when it receives an x-on flow control token.
The receiver transmits x-off when its buffers are close to being filled. The 
receiver transmits x-on as soon as buffer space is available. 

• Credit-based flow control ([4]) 

• Packets are only transmitted when receiver buffer space is known to exist.
To keep track of such buffer space, a credit counter is maintained, which is
decremented when a packet departs, and incremented when credit
tokens are received from the downstream neighbor (i.e. the receiver). Credit
tokens are sent back by the downstream neighbor (receiver) to the upstream
node (transmitter) when buffer space becomes available.

• Retry-based flow control 

• A rather opposite, although similar, scheme is used by SCI [8]. This protocol
is referred to as the 'A/B retry' protocol: The receiver accepts all incoming
packets until its buffers are full, at which point it switches state to only accept
previously retried packets. When all retried packets have finally been accepted,
the receiver switches state to accept new packets again, and so on.

The main difference between the schemes shows up in a heavily congested
system: In a system with x-on/x-off or credit-based flow control there will be no link
traffic at all, while the retry scheme used by SCI fills up the link with retries (retried
packets waiting to be accepted). If the receiver buffer is indiscriminately shared by
traffic going to all different destinations, all the above flow control methods are
referred to as single-lane flow control. The problem with single-lane flow control is
analogous to HOL blocking: data going to congested destinations accumulates in the
buffers, hence blocking packets destined elsewhere from proceeding at full speed.
An everyday analogy is the single-lane street: cars waiting to turn left block
cars headed straight.

Virtual Channel Flow Control 

HOL-blocking occurring due to single lane flow control can be overcome by 
use of virtual channel flow control or multi-lane flow control, as described in [2]. A
virtual channel consists of a buffer that can hold one or more packets, and some 
state information. Several virtual channels share the bandwidth of a single physical 
channel. Virtual channels decouple allocation of buffers from allocation of chan- 
nels by providing multiple buffers for each channel in the network. Thus a cell B 
can pass blocked cell A if B belongs to a different channel. 

Ideally, separate buffer space is required for each connection at each hop.
The receiver buffer space per connection must be in proportion to this
connection's peak throughput times the round-trip time, to allow each connection to
proceed at full speed. This static buffer allocation ensures complete independence
of each connection from all others, at the cost of a large number of buffers, which
makes it impractical to implement in an ASIC (Application Specific Integrated
Circuit) and/or an FPGA/PLD (Field-Programmable Gate Array/Programmable
Logic Device).

A refined solution is to partition into flow groups: A flow group, at each point 
in the network, is a set of connections that have a common destination and a com- 
mon channel to it. Hence all members of a flow group can be flow controlled to- 
gether. 

To reduce the required buffer space even further, various schemes of
dynamically shared memory between the flow groups have been proposed. A simple
scheme addressing this is shown in Figure 1. Figure 1 shows a transmitter 10
sending a packet 14 to a receiver 11. The receiver 11 has a buffer 12 with B
buffers (0, 1, ..., B-1). The buffer space B in the receiver 11 is shared among F flow
groups flowGr. At most b packets 14 of a given flow group (flowGr) are allowed in
the buffer 12 at once. The number of different flow groups that can fit into the
buffer 12 at once is L = B/b, where L is the number of lanes and b the maximum
number of packets allowed per flow group. Separate credits fgCr are given for each
flow group. poolCr 13 is a credit count used to prevent the buffer 12 from
overflowing. A packet of flow group i departs only if

    fgCr[i] > 0 and poolCr > 0.

When a packet of flow group i departs, the credit counts fgCr[i] and poolCr are
decremented by one, and when a credit for flow group i is received, the credit
counts fgCr[i] and poolCr are incremented by one.
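
The credit bookkeeping of the prior-art scheme in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (SharedBufferCredits, send, credit_received) are hypothetical, and the values B = 8, b = 4 and F = 3 are chosen purely for the example.

```python
class SharedBufferCredits:
    """Transmitter-side credit state for the shared-buffer scheme of Figure 1."""

    def __init__(self, total_buffers: int, per_group_limit: int, num_groups: int):
        # poolCr guards the whole receiver buffer (B buffers).
        self.pool_cr = total_buffers
        # fgCr[i] limits flow group i to at most b buffers.
        self.fg_cr = [per_group_limit] * num_groups

    def can_send(self, group: int) -> bool:
        # Departure condition: fgCr[i] > 0 and poolCr > 0.
        return self.fg_cr[group] > 0 and self.pool_cr > 0

    def send(self, group: int) -> bool:
        """Consume one credit of each kind if the packet may depart."""
        if not self.can_send(group):
            return False
        self.fg_cr[group] -= 1
        self.pool_cr -= 1
        return True

    def credit_received(self, group: int) -> None:
        """The receiver freed one buffer that held a packet of this group."""
        self.fg_cr[group] += 1
        self.pool_cr += 1


# Illustrative values only: B = 8 buffers, b = 4 per flow group, F = 3 flow groups.
credits = SharedBufferCredits(total_buffers=8, per_group_limit=4, num_groups=3)
sent_group0 = sum(credits.send(0) for _ in range(6))  # stopped by fgCr[0]
sent_group1 = sum(credits.send(1) for _ in range(6))  # stopped by fgCr[1]
sent_group2 = sum(credits.send(2) for _ in range(6))  # stopped by poolCr
```

With these values two flow groups fill the buffer (L = B/b = 2 lanes), so the third group has to wait for a credit even though its own fgCr count is untouched.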

However, all the prior art implementations of virtual channel flow control
with dynamically shared memory suffer from some defects. First, they are based
on credit-based flow control, which means they are not general in the sense that
they can also be applied to x-on/x-off and/or retry-based flow control
([3],[5],[6],[7]). Second, some presume non-lossless cell transmission in the
case of heavy congestion ([6]). Although this may be acceptable in a LAN/WAN,
it is certainly not acceptable in a SAN. Third, most of them are based on the
requirement of a "descriptor" block per virtual channel (per connection), where the
descriptor block contains various counters and registers. This solution is described
in [1]. It leads to a large amount of logic needed per hop, which introduces a
general scalability problem. Finally, the prior art does not provide protection against
congestion resulting either from a failed network component, or from a
high-performance link feeding into a low-performance link.

The object of the invention is to provide a solution to the problems presen- 
ted above. 

SUMMARY OF THE INVENTION 

In accordance with a first aspect the present invention provides a method 
for virtual channel flow control at the link level in a communication network, the 
network comprising at least one communication link having a transmitter end and 
a receiver end, a transmitter at the transmitter end for transmitting data cells over 
the communication link, a receiver at the receiver end for receiving the data cells 
transmitted over the communication link, the receiver including a plurality of 
buffers for storing the data cells, data cells with the same destination address
belonging to a same flow group, wherein a flow group is only allowed to occupy a
part of the available buffer space, the method comprising:

- transmitting flow control information from the receiver to the transmitter, the 
flow control information comprising receiver buffer state information, and 

- using a data cell scheduler in the transmitter for taking appropriate action de- 
pending on the received flow control information, the scheduler ensuring trans- 
mission fairness between the flow groups. 

In a preferred embodiment of the invention the method comprises determi- 
ning the available buffer space by using a content addressable memory (CAM) 
with N entries arranged in the receiver, each entry containing a valid bit and a de- 
stination address field of the corresponding buffer, the valid bit indicating whether 
the buffer is occupied and hence the validity of the destination address field. The 
content addressable memory may also be utilized for forwarding the information 
regarding available buffer space for a data cell to a flow control processor arrang- 
ed in the receiver, whereby the flow control processor transmits flow control infor- 
mation from the receiver to the transmitter. At least one programmable register ar- 
ranged in the receiver may be used for determining the number of buffers allowed 
for occupancy by each flow group. 

In accordance with a second aspect the present invention provides an 
apparatus for virtual channel flow control at the link level in a communication net- 
work, comprising at least one communication link having a transmitter end and a 
receiver end, a transmitter at the transmitter end for transmitting data cells over 
the communication link, a receiver at the receiver end for receiving the data cells 
transmitted over the communication link, the receiver including a plurality of 
buffers for storing the data cells, data cells with the same destination address be- 
longing to a same flow group, wherein a flow group is only allowed to occupy a 
part of the available buffer space, and a data cell scheduler in the transmitter, the 
scheduler being operative to take appropriate action depending on received flow 
control information from the receiver and for providing transmission fairness be- 
tween the various flow groups, wherein the flow control information comprises re- 
ceiver buffer state information. 

Preferably, the communication links are point-to-point bi-directional com- 
munication links. 

In a preferred embodiment the receiver includes at least one programmable
register, the value of the register reflecting/indicating the number of buffers
allowed for occupancy by each flow group.

In another preferred embodiment the receiver may have N buffers, where 
each buffer can contain one cell, the receiver further comprising a content ad- 
dressable memory (CAM) with N entries, each entry containing a valid bit and a 
destination address field of the corresponding buffer, the valid bit indicating 
whether the buffer is occupied and hence the validity of the destination address 
field. The receiver may then further include a receiver flow control processor. 

The method and the apparatus defined above can be used for rate control
of a high-performance link connected to a low-performance link, and also for
control of congestion resulting from failed network components.

The method and apparatus for virtual channel flow control at the link level
described above base the virtual channel allocation on the DestinationID of the
data cell. At each hop, cells destined for a particular destination are only allowed to
occupy one part of the total available receiver buffer space. This enables receiver
cell buffer sharing, while maintaining per-channel (per-connection) bandwidth with
lossless cell transmission. A higher and more efficient utilization of the receiver is
achieved. In addition, the described method and apparatus for virtual channel flow
control improve latency characteristics for a particular network path by making it
more predictable. The present invention provides a method for congestion control.
The present invention implicitly addresses injection rate control, in the case in
which a high-performance link is connected to a low-performance link. Implicitly,
the present invention also provides a method for congestion control in a situation
of failed network component(s) (e.g. Host Adapters/IO-subsystems/Bridges/
Switches/Routers etc.). Both of the above problems cause network buffers to be
filled up and may lead to watchdog time-out at the transmitter. Watchdog time-out
leads to retransmission, which causes performance degradation of the network.

The resultant system eliminates the defects of the presently known prior
art. It eliminates the need for the large amount of logic required for descriptor blocks,
while taking advantage of buffer sharing to minimize the buffer requirements at the
receiver. It also ensures lossless cell transmission. As an additional advantage it
also provides protection from congestion resulting from failed network components,
or from a high-performance link sending traffic into a low-performance link.



BRIEF DESCRIPTION OF THE DRAWINGS 

The above and other aspects of the present invention will become apparent 
from the following description read in conjunction with the accompanying drawings 
in which: 

Figure 1 presents a simplified block diagram of virtual channel flow control 
with dynamically shared memory as known in the prior art; 

Figure 2 presents an overview of a general data communication network;

Figure 3 presents a general-purpose cell;

Figure 4 illustrates a communication path between two end-nodes, A and B, 
through a network; 

Figure 5 presents a general overview of hop-by-hop flow control;

Figure 6 illustrates the virtual channel flow control in accordance with the
present invention;

Figure 7 presents a detailed block diagram of the receiver according to an 
embodiment of the present invention; 

Figure 8 is a detailed block diagram of the transmitter according to an em- 
bodiment of the present invention; and 

Figure 9 presents a system overview of a data communication network 
where the present invention has been implemented. 

DETAILED DESCRIPTION 

The description of the example embodiments is based on the Scalable 
Coherent Interface (SCI, see [8]) as the underlying mechanism for flow control. 
However, the invention is equally applicable to network systems with other types 
of hop-by-hop link flow control, and the invention is therefore not limited to SCI. 

Figure 2 presents a general purpose data communication network. The net- 
work 20 serves as a communication medium for the nodes attached thereto. Each 
network-attached node 21 uses a point-to-point bi-directional communication link 
22 as the network connectivity medium. Each network-attached node has a unique 
network address, labeled DestinationID in Figure 2. Communication between the 
attached nodes is achieved by sending cells between the nodes. Each cell is
equipped with a DestinationID, so that the network may route the cell to the correct
destination (network-attached node) by inspecting the cell's DestinationID. A
general purpose cell is shown in Figure 3. A cell 30 may consist of a header 31, which
usually consists of information about the sender/recipient's address (i.e.
DestinationID 34), followed by a data field 32 (usually referred to as payload), and a cell
trailer 33, or a cell delimiter, which in the general case typically will be some sort of
error-detecting code (e.g. CRC (Cyclic Redundancy Check)).
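
The cell layout of Figure 3 might be modelled as follows. This is a minimal sketch rather than the SCI cell format: the 16-bit DestinationID width, the use of CRC-32 from Python's zlib module as the trailer, and the names Cell, encode and decode are all assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    destination_id: int  # header field (DestinationID 34)
    payload: bytes       # data field 32

    def encode(self) -> bytes:
        """Serialize as header + payload + trailer (error-detecting code)."""
        header = self.destination_id.to_bytes(2, "big")  # assumed 16-bit field
        body = header + self.payload
        trailer = zlib.crc32(body).to_bytes(4, "big")    # assumed CRC-32 trailer
        return body + trailer

    @staticmethod
    def decode(raw: bytes) -> "Cell":
        """Check the trailer and recover the header and payload."""
        body, trailer = raw[:-4], raw[-4:]
        if zlib.crc32(body).to_bytes(4, "big") != trailer:
            raise ValueError("CRC mismatch")
        return Cell(int.from_bytes(body[:2], "big"), body[2:])


cell = Cell(destination_id=7, payload=b"hello")
raw = cell.encode()
```

A switch only needs the first field of the decoded cell to make its routing decision, which is what makes the DestinationID a convenient key for flow control as well.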

Figure 4 shows an overview of a network communication path between 
node A 21 and node B 21. A cell transmitted by node A is routed via switches 40
on its way to node B. The switches 40 in the network are interconnected by bi- 
directional point-to-point links 22. Hop-by-hop flow control as described earlier is 
applied to each link 22. 

Figure 5 shows a detailed overview of the hop-by-hop flow control. A trans- 
mitter 50 (upstream element) is connected to a receiver 51 (downstream element) 
via a point-to-point bi-directional link 22. Both the transmitter 50 and receiver 51 
are usually part of either a switch 40 or an end-node 21 (See Figure 4). Each re- 
ceiver 51 has a receiver queue (RQ) 52 with N buffers 53, each buffer 53 capable 
of containing one cell 30. Depending on the flow control method in use the trans- 
mitter 50 may also contain one or more transmitter queue(s) (TQ). The flow control 
method used by SCI requires a transmit queue, which will be explained later. 

Whenever the receiver 51 observes that the occupied buffers 53 in RQ 52 
are getting close to N, it transmits flow control information (Flow Control Token 
(FCT) 54) back to the transmitter 50 informing the transmitter 50 to cease 
transmission of cells 30. 

Whenever the receiver 51 again observes available buffers 53 in RQ 52, it 
transmits FCT 54 back to the transmitter 50 informing the transmitter 50 to re-en- 
able transmission of cells 30. 

The present invention requires the definition of the phrase 'Flow Group', 
which is as follows: 

• A Flow Group, at each point (hop) in the network, is a set of connections 
that have a common destination and a common channel thereto. 

• A Flow Group in a network corresponds to one end-node that has a unique
address. This address is called the destination address. Each cell in the network
contains a destination address, so the routing elements in the network
can route the cell to the correct destination.

The fundamental concept in the method for virtual channel hop-by-hop flow 
control according to the present invention is that a flow group is only allowed to 
occupy a part of the N buffers in the receiver queue RQ. Figure 6 illustrates how
this concept is applied to the arrangement of Figure 5.

The inventive method requires each receiver to use a value (in Figure 6 
referred to as LimitRQ), indicating the number of buffers in RQ allowed for 
occupancy by one flow group. This value may be stored in a register. 

Referring to Figure 6, a cell 30 belonging to a flow group of destination ad- 
dress D is only allowed to occupy LimitRQ of the total number of buffers 53 in RQ 
52 at each hop. 

To achieve lossless transmission with credit-based flow control and/or
x-on/x-off flow control, the minimum value of N and LimitRQ must be equal to the
link peak throughput times the round-trip time. With retry-based flow control, the
minimum value of N and LimitRQ is 1, and lossless transmission is still maintained.
In any case, to allow full-speed communication both the value of N and the
value of LimitRQ must be equal to the link peak throughput times the round-trip
time. This is often referred to as the window size.
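
As a worked example of the window size, the short Python sketch below converts link peak throughput times round-trip time into a number of cell buffers. The function name and the figures used (a 1 Gbit/s link, a 2 microsecond round trip, 64-byte cells) are hypothetical and not taken from the description.

```python
def window_size_cells(peak_bits_per_s: int, round_trip_ns: int, cell_bits: int) -> int:
    """Buffers needed to cover one round trip at peak rate, rounded up to whole cells."""
    bits_in_flight_scaled = peak_bits_per_s * round_trip_ns  # bits times 1e9
    cell_bits_scaled = cell_bits * 1_000_000_000
    # Ceiling division on integers avoids floating-point rounding.
    return -(-bits_in_flight_scaled // cell_bits_scaled)


# Hypothetical link: 1 Gbit/s, 2 microsecond round trip, 64-byte cells.
window = window_size_cells(1_000_000_000, 2_000, 64 * 8)
```

Here 2000 bits are in flight during one round trip, which is just under four 512-bit cells, so N and LimitRQ would both need to be at least 4 for a single flow group to run at full speed.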

In a practical embodiment of the present invention, the method requires
that:

• Whenever the receiver observes that one flow group has occupied
LimitRQ buffers in RQ, it transmits flow control information (Flow Control
Token (FCT)) back to the transmitter informing the transmitter to cease
transmission of cells within that flow group.

• Whenever the receiver observes that a flow group that previously
occupied LimitRQ buffers in RQ now occupies fewer than LimitRQ buffers in
RQ, it transmits flow control information (Flow Control Token (FCT))
back to the transmitter informing the transmitter to re-enable
transmission of cells within that flow group.
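
The two rules above amount to a per-flow-group occupancy counter compared against LimitRQ. The following Python sketch illustrates this; the class FlowGroupMonitor and the token labels "stop" and "resume" are hypothetical names, since the description does not specify an FCT encoding.

```python
from collections import Counter


class FlowGroupMonitor:
    """Receiver-side bookkeeping that decides when to send an FCT per flow group."""

    def __init__(self, limit_rq: int):
        self.limit_rq = limit_rq
        self.occupied = Counter()  # DestinationID -> buffers currently held in RQ

    def cell_stored(self, dest_id: int):
        """Return ("stop", dest_id) when the group reaches LimitRQ, else None."""
        self.occupied[dest_id] += 1
        if self.occupied[dest_id] == self.limit_rq:
            return ("stop", dest_id)
        return None

    def cell_forwarded(self, dest_id: int):
        """Return ("resume", dest_id) when a group at its limit drops below it."""
        if self.occupied[dest_id] == 0:
            raise ValueError("no buffered cell for this flow group")
        was_at_limit = self.occupied[dest_id] >= self.limit_rq
        self.occupied[dest_id] -= 1
        if was_at_limit:
            return ("resume", dest_id)
        return None


monitor = FlowGroupMonitor(limit_rq=2)
tokens = [
    monitor.cell_stored(5),     # first cell of group 5: no token
    monitor.cell_stored(5),     # group 5 reaches LimitRQ: stop
    monitor.cell_stored(9),     # another group is unaffected
    monitor.cell_forwarded(5),  # group 5 drops below LimitRQ: resume
]
```
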

In a practical embodiment of the present invention, the apparatus for virtual
channel flow control may be implemented on top of the SCI link protocol (see [8]),
and then uses a RAM-based RQ buffer architecture in the receiver Rx. The RAM
is of size N, where N is the number of buffers in the RAM. Each buffer can store
one cell. In addition, a CAM (Content Addressable Memory), also of size N, is used
at the receiver.

A detailed overview of a preferred embodiment of a receiver 51 is shown in
Figure 7. In Figure 7 there is one register called LimitRQ 55a. The value of this
register 55b indicates how many buffers in the receiver queue (RQ) each flow
group is allowed to occupy. More than one LimitRQ register could also be
used, in case it is desired (in a particular implementation) to differentiate how
many RQ buffers different flow groups are allowed to occupy.

The invention does not require the use of a register of the type described
above. However, a register is preferred because its content can be
re-programmed. The value of the LimitRQ register in Figure 7 is typically
programmed once during system initialization and configuration.

Each entry 57 in the CAM 56 contains a valid bit and the DestinationID of 
the corresponding buffer 53 in the RQ 52. The valid bit, if set, indicates that the 
corresponding buffer 53 in the RQ 52 is occupied by one cell. If the valid bit is not 
set, the corresponding buffer 53 in RQ 52 is free (i.e. not used). In Figure 7, this is
illustrated by arrows 58 pointing from a CAM entry to the corresponding RQ
buffer 53. Whenever the receiver receives a new cell, the cell is placed into a
buffer 53 in RQ 52, and the DestinationID of the cell is copied into the CAM 56. 
The CAM 56 performs a lookup and compare on the DestinationID, to check if 
there are other cells with DestinationID D in the RQ. 

If there are other cells with DestinationID D in the RQ, the CAM checks
whether the number of buffers in RQ with DestinationID D is less than or equal to
the value of LimitRQ. If the number of cells with DestinationID D in RQ is less than
the value of LimitRQ, the cell is accepted (stored in RQ), and this information is
forwarded to the receiver flow control processor DP 59, which sends a flow control
token back to the transmitter 50, informing the transmitter that the cell was accepted.

WO 01/67672    PCT/NO01/00095

If the number of cells with DestinationID D in RQ is equal to the value of
LimitRQ, the cell is discarded. This information is forwarded to the receiver flow
control processor DP 59, which sends a flow control token back to the transmitter
Tx, informing the transmitter that the cell was discarded and has to be
retransmitted. A cell is also discarded if all the buffers in RQ 52 are occupied.
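
The accept/discard decision made with the CAM can be sketched as below. This is a software illustration only: a hardware CAM compares all entries in parallel, whereas the sketch uses a loop, and it performs the check before storing the cell rather than after. The names CamEntry, Receiver, receive and release, and the token strings "accepted" and "discarded", are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CamEntry:
    valid: bool = False
    destination_id: int = 0  # meaningful only while valid is set


class Receiver:
    def __init__(self, n_buffers: int, limit_rq: int):
        self.limit_rq = limit_rq
        self.cam: List[CamEntry] = [CamEntry() for _ in range(n_buffers)]
        self.rq: List[Optional[bytes]] = [None] * n_buffers

    def _matches(self, dest_id: int) -> int:
        # Count occupied buffers holding cells of this flow group.
        return sum(1 for e in self.cam if e.valid and e.destination_id == dest_id)

    def receive(self, dest_id: int, cell: bytes) -> str:
        """Return the flow control token for this cell."""
        free = next((i for i, e in enumerate(self.cam) if not e.valid), None)
        if free is None or self._matches(dest_id) >= self.limit_rq:
            return "discarded"  # RQ full, or flow group already at LimitRQ
        self.cam[free] = CamEntry(True, dest_id)
        self.rq[free] = cell
        return "accepted"

    def release(self, index: int) -> bytes:
        """Forward the cell in buffer `index` and clear its valid bit."""
        if not self.cam[index].valid:
            raise ValueError("buffer is not occupied")
        cell = self.rq[index]
        self.cam[index].valid = False
        self.rq[index] = None
        return cell


rx = Receiver(n_buffers=4, limit_rq=2)
results = [rx.receive(1, b"a"), rx.receive(1, b"b"), rx.receive(1, b"c"),
           rx.receive(2, b"d"), rx.receive(3, b"e"), rx.receive(4, b"f")]
```

In this run the third cell for destination 1 is discarded because that flow group is at LimitRQ, and the last cell is discarded because all four buffers are occupied.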

A preferred embodiment of the transmitter 50 is illustrated in Figure 8. In 
Figure 8 there is a cell scheduler 60 at the transmitter. The cell scheduler is 
responsible for cell transmission and for providing a minimum of fairness between 
the flow groups to ensure forward progress for all flow groups. 

Cells 30 which are to be transmitted or have been transmitted, are stored in 
buffers 62 in a transmit queue (TQ) 61. A cell can only be removed from the TQ 61 
whenever the transmitter receives a flow control token (FCT) from the receiver in- 
forming the transmitter that a previously transmitted cell was successfully stored in 
the receiver RQ. 

If the transmitter receives a flow control token from the receiver informing 
the transmitter that a previously transmitted cell was discarded due to lack of 
buffers in the receiver RQ, the transmitter has to retransmit this cell. To ensure 
forward progress for this cell and avoid cell starvation effects, the cell scheduler 
should not transmit any other cell within the same flow group before the cell to be 
retransmitted is accepted by the receiver. 

The cell transmission algorithm used by the cell scheduler should be imple- 
mented in such manner that fairness between the various flow groups is main- 
tained. 
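
One possible shape for such a cell scheduler is sketched below. The description leaves the fairness algorithm open, so the round-robin order used here is an assumption, as is the simplification that each flow group has at most one cell awaiting its flow control token. The names CellScheduler, enqueue, next_cell and on_token are hypothetical.

```python
from collections import OrderedDict, deque


class CellScheduler:
    """Round-robin over flow groups; a cell stays in the TQ until it is accepted."""

    def __init__(self):
        self.queues = OrderedDict()  # DestinationID -> deque of cells in the TQ
        self.in_flight = set()       # groups whose head cell awaits its FCT

    def enqueue(self, dest_id, cell):
        self.queues.setdefault(dest_id, deque()).append(cell)

    def next_cell(self):
        """Pick the next (dest_id, cell) to transmit, or None if none is eligible."""
        for dest_id in list(self.queues):
            if dest_id in self.in_flight:
                continue
            self.queues.move_to_end(dest_id)  # served group goes to the back
            self.in_flight.add(dest_id)
            return dest_id, self.queues[dest_id][0]
        return None

    def on_token(self, dest_id, accepted: bool):
        """Handle the FCT for the outstanding cell of a flow group."""
        self.in_flight.discard(dest_id)
        if accepted:
            queue = self.queues[dest_id]
            queue.popleft()  # only now may the cell leave the TQ
            if not queue:
                del self.queues[dest_id]
        # If discarded, the cell stays at the head of its queue, so it is
        # retransmitted before any other cell of the same flow group.


sched = CellScheduler()
sched.enqueue(1, "a1")
sched.enqueue(1, "a2")
sched.enqueue(2, "b1")
first = sched.next_cell()
second = sched.next_cell()
third = sched.next_cell()
sched.on_token(1, accepted=False)
retry = sched.next_cell()
```

Because a discarded cell remains at the head of its flow group's queue, the retransmission rule is satisfied by construction, while other flow groups continue to be served in turn.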

As an example of the present invention, consider the following: One RQ
contains 16 buffers, each capable of storing one cell. The value of LimitRQ is 4
buffers. If a flow group has consumed 4 buffers, that flow group is not allowed to
occupy more buffer space. The remaining 12 buffers can be used by e.g. 12 cells
from 12 different flow groups, 3 different flow groups occupying 4 buffers each, or
any other combination.
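
The combinations in this example can be checked with a small helper. The function name occupancy_allowed and the dictionary representation (DestinationID mapped to buffers held) are assumptions made for this sketch.

```python
def occupancy_allowed(cells_per_group: dict, n_buffers: int = 16, limit_rq: int = 4) -> bool:
    """True if the per-group buffer counts respect both N and LimitRQ."""
    counts = list(cells_per_group.values())
    return sum(counts) <= n_buffers and all(c <= limit_rq for c in counts)


# One flow group at its limit plus 12 groups holding one cell each.
case_a = {0: 4, **{group: 1 for group in range(1, 13)}}
# One flow group at its limit plus 3 groups holding four cells each.
case_b = {0: 4, 1: 4, 2: 4, 3: 4}
# A single flow group trying to hold five buffers.
case_c = {0: 5}
```

The first two cases fill all 16 buffers without any group exceeding LimitRQ; the third is rejected even though 11 buffers would remain free.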

Figure 9 presents a system overview of a network where the present
invention has been implemented. In Figure 9, four switches, switch 81, switch 82, switch
83 and switch 84, are connected together. Each switch contains four ports 89 (P1,
P2, P3, P4). Each port is bi-directional and contains one receiver with a receive
queue 91 and one transmitter with one transmit queue 90.

Node N0 85, node N1 86, node N2 87 and node N3 88 in Figure 9 can be end
nodes/switches/bridges/routers/etc. Node N0 85 is connected to port P0 of switch
81. Node N1 86 is connected to port P1 of switch 81. Node N2 87 is connected to
port P0 of switch 82. Node N3 88 is connected to port P1 of switch 82. Cells being
sent from node N1 to node N3 traverse the path: port P1 to port P2 in switch 81 to
port P0 to port P1 in switch 83 to port P2 to port P1 in switch 82. Cells being sent
from node N0 to node N2 traverse the path: port P0 to port P2 in switch 81 to port
P0 to port P1 in switch 83 to port P2 to port P0 in switch 82. Thus packets sent
from node N0 85 to node N2 87 will use the same intermediate path through the
switch fabric from switch 81 to switch 83 to switch 82 as packets sent from node
N1 to node N3. If node N3 is subject to congestion, eventually transmit queue 90
of port P0 of switch 82, receive queue 91 of port P2 of switch 82, transmit queue
90 of port P1 of switch 83, receive queue 91 of port P0 of switch 83, transmit
queue 90 of port P2 of switch 81, and receive queue 91 of port P1 of switch 81
will be filled up with cells going from node N1 86 to node N3 88. This means that
cells going from node N0 85 to node N2 87 will not move forward at receive queue
91 in port P0 in switch 81, since the transmit queue in port P2 of switch 81 is full.
With the present invention, cells from node N0 to node N2 can proceed without
being blocked by the cells from node N1 to node N3, since these latter cells are
only allowed to occupy one part (given by LimitRQ) of the transmit queue 90 of
port P0 of switch 82, receive queue 91 of port P2 of switch 82, transmit queue 90
of port P1 of switch 83, receive queue 91 of port P0 of switch 83, transmit queue
90 of port P2 of switch 81, and receive queue 91 of port P1 of switch 81. This also
allows a more optimal use of the available buffer space as opposed to traditional
VC solutions, in which a fixed part of the available buffer space is dedicated to
each VC. Less buffer space is thus required in the present solution.

As opposed to prior art virtual channel flow control with dynamically
shared memory at the receiver as described in [1], the receiver described above does
not require a descriptor block per virtual channel. Hence, both the logic and the
buffer space needed are reduced.

In case of network congestion, either as a result of a high-speed link feeding
into a low-speed link, or as a result of a failed network component, both causing
network buffers to be filled up, the present invention reduces the amount of
head-of-line blocking locally and dynamically at each hop (switch point) in the network.
The end result is increased performance and improved network reliability.

Having described preferred embodiments of the invention, it will be
apparent to those skilled in the art that other embodiments incorporating the concepts
may be used. These and other examples of the invention illustrated above are
intended by way of example only and the actual scope of the invention is to be
determined from the following claims.

CLAIMS

1. A method for virtual channel flow control at the link level in a communication
network, the network comprising at least one communication link having a trans- 
mitter end and a receiver end, a transmitter at the transmitter end for transmitting 
data cells over the communication link, a receiver at the receiver end for receiving 
the data cells transmitted over the communication link, the receiver including a plu- 
rality of buffers for storing the data cells, data cells with the same destination 
address belonging to a same flow group, wherein a flow group is only allowed to 
occupy a part of the available buffer space, the method comprising: 

- transmitting flow control information from the receiver to the transmitter, the 
flow control information comprising receiver buffer state information, and 

- using a data cell scheduler in the transmitter for taking appropriate action de- 
pending on the received flow control information, including ensuring transmis- 
sion fairness between the flow groups. 

2. Method according to claim 1, comprising determining the available buffer
space by using a content addressable memory (CAM) with N entries arranged in 
the receiver, each entry containing a valid bit and a destination address field of the 
corresponding buffer, the valid bit indicating whether the buffer is occupied and 
hence the validity of the destination address field. 

3. Method according to claim 2, wherein the number of buffers allowed for 
occupancy by each flow group is determined by at least one programmable 
register arranged in the receiver. 

4. Method according to claim 2, wherein the content addressable memory for- 
wards the information regarding available buffer space for a data cell to a flow con- 
trol processor arranged in the receiver, whereby the flow control processor trans- 
mits flow control information from the receiver to the transmitter. 

5. An apparatus for virtual channel flow control at the link level in a
communication network, comprising

- at least one communication link having a transmitter end and a receiver end,

- a transmitter at the transmitter end for transmitting data cells over the
communication link,

- a receiver at the receiver end for receiving the data cells transmitted over the
communication link, the receiver including a plurality of buffers for storing the
data cells, data cells with the same destination address belonging to a same
flow group, wherein a flow group is only allowed to occupy a part of the
available buffer space, and

- a data cell scheduler in the transmitter, the scheduler being operative to take 
appropriate action depending on received flow control information from the 
receiver and for providing transmission fairness between the various flow 
groups, wherein the flow control information comprises receiver buffer state 
information. 

6. Apparatus according to claim 5, wherein the communication links are point- 
to-point bi-directional communication links. 

7. Apparatus according to claim 5, wherein the receiver includes at least one 
programmable register, the value of the register reflecting/indicating the number of 
buffers each flow group is allowed to occupy. 

8. Apparatus according to claim 5, the receiver having N buffers, where each 
buffer can contain one cell, the receiver further comprising a content addressable 
memory (CAM) with N entries, each entry containing a valid bit and a destination 
address field of the corresponding buffer, the valid bit indicating whether the buffer 
is occupied and hence the validity of the destination address field. 

9. Apparatus according to claim 8, wherein the receiver further comprises a 
receiver flow control processor. 

10. Use of the method according to claim 1 and the apparatus according to
claim 5, for rate control of a high-performance link connected to a low-performance
link.

11. Use of the method according to claim 1 and the apparatus according to
claim 5, for control of congestion resulting from failed network components.
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